Abstract

The number of "carries" when n random integers are added forms a Markov 
chain [23]. We show that this Markov chain has the same transition matrix as
the descent process when a deck of n cards is repeatedly riffle shuffled. This 
gives new results for the statistics of carries and shuffling. 

1 Introduction 

In a wonderful article in this Monthly, John Holte [23] found fascinating mathematics
in the usual process of "carries" when adding integers. His article reminded us
of the mathematics of shuffling cards. This connection is developed below.
Consider adding two 50-digit binary numbers (the top row records the carries):

1 11111 11100 01110 01000 00001 00111 10111 00000 01111 11100

  01101 11110 10111 00110 00000 10011 11011 10001 00011 11010
  10111 01011 00011 10101 11110 10001 01000 11010 10101 01111
1 00101 01001 11010 11011 11111 00101 00100 01011 11001 01001

For this example, 28/50 = 56% of the columns have a carry of 1. Holte shows
that if the binary digits are chosen at random, uniformly, in the limit 50% of
all the carries are zero. This holds no matter what the base. More generally, if
n integers (base b) are produced by choosing their digits uniformly at random in
$\{0, 1, \ldots, b-1\}$, the sequence of carries $\kappa_0 = 0, \kappa_1, \kappa_2, \ldots$ is a Markov chain taking
values in $\{0, 1, 2, \ldots, n-1\}$. Holte begins by deriving the transition matrix between
successive carries $\kappa, \kappa'$.
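
The carry sequence of the example can be recomputed mechanically. The following is a minimal sketch; the function name `carry_sequence` and the digit-string input format are illustrative choices, not notation from Holte's article.

```python
def carry_sequence(numbers, base):
    """Carries kappa_1, kappa_2, ... when digit strings are added column by column.

    numbers: equal-length digit strings, most significant digit first.
    """
    carry, carries = 0, []
    for col in range(len(numbers[0]) - 1, -1, -1):      # right-most column first
        column_sum = carry + sum(int(s[col], base) for s in numbers)
        carry = column_sum // base                       # amount carried to the next column
        carries.append(carry)
    return carries

a = "01101 11110 10111 00110 00000 10011 11011 10001 00011 11010".replace(" ", "")
b = "10111 01011 00011 10101 11110 10001 01000 11010 10101 01111".replace(" ", "")
kappa = carry_sequence([a, b], base=2)
print(sum(kappa), len(kappa))   # 28 50
```

The same function accepts any number of summands and any base up to 36, so it can be used to explore the chain for general n and b.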



(H1) $P(i,j) = P(\kappa' = j \mid \kappa = i) = P\{\, jb \le i + X_1 + \cdots + X_n \le (j+1)b - 1 \,\}$.

Here, $0 \le i, j \le n-1$ and $X_1, X_2, \ldots, X_n$ are independent and uniformly
distributed on $\{0, 1, \ldots, b-1\}$.

(H2) When $b = 2$, for any $n$, the transition matrix is

\[ P(i,j) = 2^{-n} \binom{n+1}{2j - i + 1}. \]

(H3) For $n = 3$, for all $b$,

\[ P = \frac{1}{6b^2} \begin{pmatrix} b^2+3b+2 & 4b^2-4 & b^2-3b+2 \\ b^2-1 & 4b^2+2 & b^2-1 \\ b^2-3b+2 & 4b^2-4 & b^2+3b+2 \end{pmatrix}. \]
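
The probabilistic description in (H1) can be checked against the closed forms (H2) and (H3) by brute-force enumeration of the digits. This is only a sketch under that reading of (H1); the helper names (`carries_matrix`, `h2_matrix`, `h3_matrix`) are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb

def carries_matrix(n, b):
    """Transition matrix P(i, j) of (H1), by enumerating all b**n digit columns."""
    P = [[Fraction(0)] * n for _ in range(n)]
    weight = Fraction(1, b**n)
    for i in range(n):                                  # incoming carry
        for digits in product(range(b), repeat=n):      # X_1, ..., X_n
            P[i][(i + sum(digits)) // b] += weight      # outgoing carry
    return P

def binom(m, k):
    """Binomial coefficient, taken to be 0 outside 0 <= k <= m."""
    return comb(m, k) if 0 <= k <= m else 0

def h2_matrix(n):
    """Closed form (H2) for b = 2."""
    return [[Fraction(binom(n + 1, 2 * j - i + 1), 2**n) for j in range(n)]
            for i in range(n)]

def h3_matrix(b):
    """Closed form (H3) for n = 3."""
    rows = [[b*b + 3*b + 2, 4*b*b - 4, b*b - 3*b + 2],
            [b*b - 1,       4*b*b + 2, b*b - 1],
            [b*b - 3*b + 2, 4*b*b - 4, b*b + 3*b + 2]]
    return [[Fraction(x, 6 * b * b) for x in row] for row in rows]

print(all(carries_matrix(n, 2) == h2_matrix(n) for n in range(1, 7)))
print(all(carries_matrix(3, b) == h3_matrix(b) for b in range(2, 7)))
```

Exact rational arithmetic (`Fraction`) is used so that the comparison with the closed forms is an equality rather than a floating-point approximation.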



These are the "amazing matrices" of Holte's title. Among many things, Holte shows 

(H4) The matrix $P(i,j)$ of (H1) has stationary vector $\pi_n(j)$ (left eigenvector with
eigenvalue 1) independent of the base $b$:

\[ \pi_n(j) = \frac{A(n,j)}{n!} \]

with $A(n,j)$ the Eulerian number. This may be defined as

(H4') $A(n,j)$ is the number of permutations in the symmetric group $S_n$ with $j$
descents. Recall that $\sigma \in S_n$ has a descent at $i$ if $\sigma(i+1) < \sigma(i)$. So 5 1 3 2 4
has two descents.

(H4'') $A(n,j)$ is the coefficient of $x^{j+1}$ in the polynomial $p_n(x)$ where

\[ \sum_{i=0}^{\infty} i^n x^i = \frac{p_n(x)}{(1-x)^{n+1}}. \]

(H4''') $A(n,j) = \sum_{l=0}^{j+1} (-1)^l \binom{n+1}{l} (j+1-l)^n$.

Definition (H4') is most relevant to the present paper. (H4'') is equivalent to Worpitzky's
identity. It has many proofs and appearances, e.g., to juggling sequences
[11]. Finally, (H4''') goes back to Euler. An elementary development of these ideas
is in [12].

When $n = 2$, $A(2,0) = A(2,1) = 1$, thus $\pi_2(0) = \pi_2(1) = 1/2$ is the limiting
frequency of carries when two long integers are added. When $n = 3$, $A(3,0) = 1$,
$A(3,1) = 4$, $A(3,2) = 1$, giving $\pi_3(0) = 1/6$, $\pi_3(1) = 2/3$, $\pi_3(2) = 1/6$.

Holte further shows 

(H5) The matrix $P(i,j)$ of (H1) has eigenvalues $1, 1/b, 1/b^2, \ldots, 1/b^{n-1}$ with explicitly
computable eigenvectors independent of $b$.

(H6) Let $P_b$ denote the matrix in (H1). Then for all real $a, b$,

\[ P_a P_b = P_{ab}. \]
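
Properties (H4) and (H6) lend themselves to an exact numerical check for small n, and the trace of P gives a quick consistency check on the eigenvalues in (H5). The sketch below assumes the enumeration reading of (H1) and counts Eulerian numbers by brute force as in (H4'); all function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

def carries_matrix(n, b):
    """Transition matrix P(i, j) of (H1), by enumerating all b**n digit columns."""
    P = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for digits in product(range(b), repeat=n):
            P[i][(i + sum(digits)) // b] += Fraction(1, b**n)
    return P

def eulerian_numbers(n):
    """A(n, 0), ..., A(n, n-1): permutations of n letters counted by descents."""
    counts = [0] * n
    for p in permutations(range(n)):
        counts[sum(p[i] > p[i + 1] for i in range(n - 1))] += 1
    return counts

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

n = 4
pi = [Fraction(a, factorial(n)) for a in eulerian_numbers(n)]
P = {b: carries_matrix(n, b) for b in (2, 3, 6)}

# (H4): pi is a left eigenvector with eigenvalue 1, for every base tried.
stationary = all(matmul([pi], P[b])[0] == pi for b in P)
# Consistent with (H5): the trace equals 1 + 1/b + ... + 1/b^(n-1).
trace_ok = all(sum(P[b][i][i] for i in range(n)) ==
               sum(Fraction(1, b**k) for k in range(n)) for b in P)
# (H6) with a = 2, b = 3.
semigroup = matmul(P[2], P[3]) == P[6] == matmul(P[3], P[2])
print(stationary, trace_ok, semigroup)
```

The trace test does not prove (H5); it only confirms that the sum of the eigenvalues agrees with the stated list.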

When we saw properties (H4), (H5), (H6), we hollered "Wait, this is all about
shuffling cards!" Knowledgeable readers may well think, "For these two guys, everything
is about shuffling cards." While there is some truth to these thoughts, we
justify our claim in the next section. Following this we show how the connection
between carries and shuffling contributes to each subject. The rate of convergence
of the Markov chain (H1) to the stationary distribution $\pi_n$ is given in Section 4;
the argument shows that the matrix P is totally positive of order 2. Finally, we
show how the same matrix occurs in taking sections of generating functions,
discuss carries for multiplication, and describe another "amazing matrix".

Our developments do not exhaust the material in Holte's article, which we enthusiastically
recommend. A "higher math" perspective on arithmetic carries as
cocycles [24] suggests many further projects. We have tried to keep the presentation
elementary, and mention the (more technical) companion paper (TB | which
analyzes the carries chain using symmetric function theory and gives analogs of our
main results for other Coxeter groups.
2 Shuffling Cards



How many times should a deck of n cards be riffle shuffled to thoroughly mix it? For 
an introduction to this subject, see 13, l27ll. The main theoretical developments are 
in [B|, with further developments in [l9l. [2pj . A survey of the many connections 
and developments is in [ijj]. The basic shuffling mechanism was suggested by [2lj ]. 
It gives a realistic mathematical model for the usual method of riffle shuffling n 
cards: 

• Cut off C cards with probability $\binom{n}{C}/2^n$, $0 \le C \le n$.

• Shuffle the two parts of the deck according to the following rule: if at some 
stage there are A cards in one part and B cards in the other part, drop the 
next card from the bottom of the first part with probability A/(A + B) and 
from the bottom of the second part with probability B/(A + B). 

• Continue until all cards are dropped. 

Let $Q(\sigma)$ be the probability of generating the permutation $\sigma$ after one shuffle,
starting from the identity. Repeated shuffling is modeled by convolution:

\[ Q^{*2}(\sigma) = \sum_{\eta} Q(\eta)\, Q(\sigma\eta^{-1}), \qquad Q^{*h}(\sigma) = \sum_{\eta} Q^{*(h-1)}(\eta)\, Q(\sigma\eta^{-1}). \tag{1} \]

Thus to be at $\sigma$ after two shuffles, the first shuffle goes to some permutation $\eta$ and
the second must be to $\sigma\eta^{-1}$. The uniform distribution is $U(\sigma) = 1/n!$. Standard
theory shows that

\[ Q^{*h}(\sigma) \to U(\sigma) \quad \text{as } h \to \infty. \tag{2} \]

The references above give useful rates for the convergence in (2), showing that it
takes $h = \frac{3}{2}\log_2 n + c$ to get $2^{-c}$ close to random. When $n = 52$, this becomes
$h = 7$ shuffles.

To explain the connection with carries, it is useful to have a second description of
shuffling. Consider dropping n points uniformly at random into [0, 1]. Label these
points in order $x_{(1)} < x_{(2)} < \cdots < x_{(n)}$. The Baker's transformation $x \mapsto 2x \pmod 1$
maps [0, 1] into itself and permutes the points. Let $\sigma$ be the induced permutation.
As shown in the references above, the chance of $\sigma$ is exactly $Q(\sigma)$. A natural generalization of this
shuffling scheme to "b-shuffles" is induced from $x \mapsto bx \pmod 1$ with b fixed in
$\{1, 2, 3, \ldots\}$. Thus ordinary riffle shuffles are 2-shuffles and a 3-shuffle results from
dividing the deck into three piles and dropping cards sequentially from the bottom
of each pile with probability proportional to packet size.

Let $Q_b(\sigma)$ be the probability of $\sigma$ after a b-shuffle. From this geometric description,

\[ Q_a * Q_b = Q_{ab}. \tag{3} \]

The Gilbert-Shannon-Reeds measure is $Q_2$ in this notation and we see that $Q_2^{*h} = Q_{2^h}$.
Thus to study repeated shuffles, we need only understand a single b-shuffle.
A main result of [fj is a simple formula:

\[ Q_b(\sigma) = \frac{\binom{n+b-r}{n}}{b^n}. \tag{4} \]

Here $r = r(\sigma) = 1 + \#\{\text{descents in } \sigma^{-1}\}$.
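
Formula (4) and the convolution rule (3) can be tested together on a small deck. The sketch below represents permutations as tuples on {0, ..., n-1} and uses the convolution convention of (1); the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations
from math import comb

def descents(p):
    return sum(p[i] > p[i + 1] for i in range(len(p) - 1))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

def compose(s, t):
    """The permutation i -> s(t(i))."""
    return tuple(s[t[i]] for i in range(len(t)))

def Q(b, sigma):
    """Formula (4): chance of sigma after one b-shuffle."""
    n = len(sigma)
    r = 1 + descents(inverse(sigma))
    return Fraction(comb(n + b - r, n), b**n)

def convolve(a, b, sigma, group):
    """(Q_a * Q_b)(sigma) = sum over eta of Q_a(eta) Q_b(sigma eta^{-1})."""
    return sum(Q(a, eta) * Q(b, compose(sigma, inverse(eta))) for eta in group)

n = 4
S_n = list(permutations(range(n)))
print(sum(Q(3, s) for s in S_n))                              # total mass of Q_3
print(all(convolve(2, 3, s, S_n) == Q(6, s) for s in S_n))    # (3) with a = 2, b = 3
```

That the masses sum to one is exactly Worpitzky's identity from (H4'') in disguise.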

In addition to the similarities between (H6) and (3), [5] and [2§| proved that
the eigenvalues of the Markov chain induced by $Q_b$ are $1, 1/b, 1/b^2, \ldots, 1/b^{n-1}$.
This and the appearance of descents convinced us that there must be an intimate
connection between carries and shuffling. The main result of this article makes this
precise.

Theorem 2.1. The number of descents in successive b-shuffles of n cards forms a
Markov chain on $\{0, 1, \ldots, n-1\}$ with transition matrix $P(i,j)$ of (H1).

3 Bijective Methods 

First we describe some notation to be used throughout. The number of descents
of a permutation $\tau$ is denoted by $d(\tau)$. Label the columns of the n numbers to be
added mod b by $C_1, C_2, C_3, \ldots$ where $C_1$ is the right-most column.

The main purpose of this section is to give a bijective proof of the following
theorem, which implies Theorem 2.1 of Section 2.

Theorem 3.1. Let $\kappa_j$ denote the amount carried from column j to column j + 1
when n length m numbers are added mod b. Let $\tau_j$ be the permutation obtained after
the iteration of j b-shuffles, started at the identity. Then

\[ P(\kappa_1 = i_1, \ldots, \kappa_m = i_m) = P(d(\tau_1) = i_1, \ldots, d(\tau_m) = i_m) \]

for all values of $i_1, \ldots, i_m$.

In preparation for the proof, some notation and lemmas will be needed. 

Lemma 3.2. Let $\kappa(C_j \cdots C_1)$ denote the amount carried from column j to column
j + 1 when the corresponding j-tuples are added (adding consecutive j-tuples one at
a time rather than adding a column at a time). Then $\kappa(C_j \cdots C_1) = \kappa_j$.

Proof. This is clear since in calculating the carry to column j + 1 it is irrelevant
how one adds the numbers in the preceding columns. ■

Given a length n list of j-tuples of numbers mod b, one says that the list has
a descent at position i if the (i+1)st j-tuple is smaller than the ith j-tuple. For
example the following list of 3-tuples of mod 3 numbers:

0 1 2
1 0 1
2 2 0
1 0 1
0 2 0
2 1 1

has a descent at position 3 since 220 is greater than 101, and a descent at position
4 since 101 is greater than 020.

Given a length n list of j-tuples of numbers mod b, one says that the list has a
carry at position i if the addition of the (i+1)st j-tuple on the list to the sum of the
first i j-tuples increases the amount that would be carried to the (j+1)st column
(it might seem more natural to say that the carry is at position i + 1, but our
convention will be useful). For example the following list of 3-tuples of mod 3 numbers:

0 1 2
0 1 2
1 1 2
1 1 1
2 1 2
1 2 1



has a carry at positions 3 and 4. Indeed (0, 1, 2) + (0, 1, 2) = (1, 0, 1) which doesn't 
create a carry. Adding (1,1,2) gives (2,2,0) which still doesn't create a carry. 
Adding (1, 1, 1) gives (1, 0, 1) with a carry, so there is a carry at position 3. Adding 
(2, 1, 2) gives (0, 2, 0) with a carry, so there is a carry at position 4. Finally adding 
(1, 2, 1) gives (2, 1, 1), which doesn't create a carry. 

For what follows we use a bijection, which we call the bar map, on sets of j
column vectors having length n and entries in $0, 1, \ldots, b-1$. Given $C_j \cdots C_1$, then
$\overline{C_j \cdots C_1}$ is defined as follows: the ith j-tuple of $\overline{C_j \cdots C_1}$ consists of the right-most
j coordinates of the mod b sum of the first i j-tuples of $C_j \cdots C_1$. For example,

\[ C_3C_2C_1 = \begin{pmatrix} 0&1&2 \\ 0&1&2 \\ 1&1&2 \\ 1&1&1 \\ 2&1&2 \\ 1&2&1 \end{pmatrix} \mapsto \overline{C_3C_2C_1} = \begin{pmatrix} 0&1&2 \\ 1&0&1 \\ 2&2&0 \\ 1&0&1 \\ 0&2&0 \\ 2&1&1 \end{pmatrix}. \]

Indeed 012 + 012 = 101 giving the second line of $\overline{C_3C_2C_1}$. Then 101 + 112 = 220
giving the third line, and 220 + 111 = 101 (retaining only the last 3 coordinates),
giving the fourth line, etc. One can easily invert the bar map, so it is a bijection.
The following lemma is immediate from these definitions.

Lemma 3.3. $\overline{C_j \cdots C_1}$ has a descent at position i if and only if $C_j \cdots C_1$ has a
carry at position i.

Given a length n collection of j-tuples of numbers mod b, we define an associated
permutation $\pi$ by labeling the j-tuples from lexicographically smallest to largest
(considering the higher up j-tuple to be smaller in case of ties). For example with
n = 6, j = 2, b = 3, one would have

\[ \pi \begin{pmatrix} 1&2 \\ 2&1 \\ 1&0 \\ 0&1 \\ 0&0 \\ 2&1 \end{pmatrix} = \begin{matrix} 4 \\ 5 \\ 3 \\ 2 \\ 1 \\ 6 \end{matrix} \]

since (0, 0) is the smallest, followed by (0, 1), (1, 0), (1, 2), then the uppermost copy
of (2, 1) and finally the lowermost copy of (2, 1). Note that we use the standard
convention for writing permutations, i.e. $1 \mapsto 4$, $2 \mapsto 5$, etc. We mention that this
construction appears in the theory of inverse riffle shuffling.

Lemma 3.4. $C_j \cdots C_1$ has a descent at position i if and only if the associated
permutation $\pi(C_j \cdots C_1)$ has a descent at position i.

Proof. This is immediate from the definition of $\pi$. ■



To proceed define a second bijection, called the star map, on sets of j column
vectors having length n and entries in $0, 1, \ldots, b-1$. This sends column vectors
$A_j \cdots A_1$ to $(A_j \cdots A_1)^*$ defined as follows. The right-most column of $(A_j \cdots A_1)^*$
is $A_1$. The second column in $(A_j \cdots A_1)^*$ is obtained by putting the entries of $A_2$
in the order specified by the permutation corresponding to the right-most column of
$(A_j \cdots A_1)^*$ (which is $A_1$). Then the third column in $(A_j \cdots A_1)^*$ is obtained by
putting the entries of $A_3$ in the order specified by the permutation corresponding
to the two right-most columns of $(A_j \cdots A_1)^*$, and so on.

For example,

\[ A_3A_2A_1 = \begin{pmatrix} 1&2&2 \\ 1&2&1 \\ 2&0&0 \\ 0&0&1 \\ 2&1&0 \\ 0&1&1 \end{pmatrix} \mapsto (A_3A_2A_1)^* = \begin{pmatrix} 0&1&2 \\ 1&0&1 \\ 2&2&0 \\ 1&0&1 \\ 0&2&0 \\ 2&1&1 \end{pmatrix}. \]

Indeed, the right-most column of $(A_3A_2A_1)^*$ is $A_1$. The second column of
$(A_3A_2A_1)^*$ is obtained by taking the entries of $A_2$ (namely 2, 2, 0, 0, 1, 1) and putting
the 2 next to the smallest element of $A_1$ (so the highest 0), then the second 2 next to
the 2nd smallest element (so the second 0), then the 0 next to the 3rd smallest
element (so the highest 1), then the second 0 next to the 4th smallest element (so
the second 1), then the 1 next to the 5th smallest element (so the third 1), and
finally the second 1 next to the 6th smallest element (so the only 2), giving

\[ \begin{pmatrix} 1&2 \\ 0&1 \\ 2&0 \\ 0&1 \\ 2&0 \\ 1&1 \end{pmatrix}. \]

Then the third column of $(A_3A_2A_1)^*$ is obtained by taking the entries of $A_3$
(namely 1, 1, 2, 0, 2, 0) and putting the 1 next to the smallest pair (so the highest
(0,1)), then putting the second 1 next to the 2nd smallest pair (so the second
(0,1)), then the 2 next to the third smallest pair (1,1), then the 0 next to the
fourth smallest pair (1,2), then the second 2 next to the fifth smallest pair (the
highest (2,0)), and finally the second 0 next to the sixth smallest pair (the second
(2,0)).

The star map is straightforward to invert (we leave this as an exercise to the 
reader), so it is a bijection. 

The crucial property of the star map is given by the following lemma, the j = 2
case of which is essentially equivalent to the "AB & B" formula in Section 9.4 of [27].

Lemma 3.5.

\[ \pi(A_j) \cdots \pi(A_1) = \pi[(A_j \cdots A_1)^*], \]

where the product on the left is the usual multiplication of permutations.

As an illustration,

\[ A_3A_2A_1 = \begin{pmatrix} 1&2&2 \\ 1&2&1 \\ 2&0&0 \\ 0&0&1 \\ 2&1&0 \\ 0&1&1 \end{pmatrix} \]

yields the permutations

\[ \begin{array}{ccc} \pi(A_3) & \pi(A_2) & \pi(A_1) \\ 3&5&6 \\ 4&6&3 \\ 5&1&1 \\ 1&2&4 \\ 6&3&2 \\ 2&4&5 \end{array} \]

Also as calculated above,

\[ (A_3A_2A_1)^* = \begin{pmatrix} 0&1&2 \\ 1&0&1 \\ 2&2&0 \\ 1&0&1 \\ 0&2&0 \\ 2&1&1 \end{pmatrix} \]

which yields the permutations

\[ \begin{array}{ccc} \pi[(A_3A_2A_1)^*] & \pi[(A_2A_1)^*] & \pi[(A_1)^*] \\ 1&4&6 \\ 3&1&3 \\ 6&5&1 \\ 4&2&4 \\ 2&6&2 \\ 5&3&5 \end{array} \]

Note that $\pi(A_1^*) = \pi(A_1)$, $\pi[(A_2A_1)^*] = \pi(A_2)\pi(A_1)$, and $\pi[(A_3A_2A_1)^*] = \pi(A_3)\pi(A_2)\pi(A_1)$,
and Lemma 3.5 gives that this happens in general.
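
The star map and the identity of Lemma 3.5 can be replayed on this example. The sketch below is one possible encoding, assuming columns are passed right-most first and permutations are written in one-line notation with values 1, ..., n and composed right to left; the names `perm_of_rows`, `star`, and `compose` are hypothetical.

```python
def perm_of_rows(rows):
    """pi: label rows from lexicographically smallest to largest, ties broken top first."""
    order = sorted(range(len(rows)), key=lambda i: (rows[i], i))
    pi = [0] * len(rows)
    for label, i in enumerate(order, start=1):
        pi[i] = label
    return pi

def column_perm(column):
    return perm_of_rows([(a,) for a in column])

def star(columns):
    """Star map. columns = [A_1, A_2, ..., A_j], right-most column first.
    Returns the rows of (A_j ... A_1)^*, each row written left to right."""
    n = len(columns[0])
    rows = [(a,) for a in columns[0]]
    for A in columns[1:]:
        pi = perm_of_rows(rows)
        # the k-th entry of A is placed next to the k-th smallest current row
        rows = [(A[pi[i] - 1],) + rows[i] for i in range(n)]
    return rows

def compose(s, t):
    """Product s t of permutations, meaning i -> s(t(i))."""
    return [s[t[i] - 1] for i in range(len(t))]

A1 = [2, 1, 0, 1, 0, 1]
A2 = [2, 2, 0, 0, 1, 1]
A3 = [1, 1, 2, 0, 2, 0]

starred = star([A1, A2, A3])
lhs = compose(column_perm(A3), compose(column_perm(A2), column_perm(A1)))
rhs = perm_of_rows(starred)
print(starred)
print(lhs, rhs)   # both [1, 3, 6, 4, 2, 5]
```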

Proof of Lemma 3.5. This is clear for j = 1, so consider j = 2. Then the claim
is perhaps easiest to see using the theory of inverse riffle shuffles. Namely given a
column of n numbers mod b, mark cards $1, \ldots, n$ with these numbers, then bring
the cards labeled 0 to the top (cards higher up remaining higher up), then bring
the cards labeled 1 just beneath them, and so on. For instance,

\[ \begin{matrix} 2 \\ 1 \\ 0 \\ 1 \\ 0 \\ 1 \end{matrix} \;\mapsto\; \begin{matrix} 3 \\ 5 \\ 2 \\ 4 \\ 6 \\ 1 \end{matrix}. \]

Note that (in the notation of the example) this is $\pi(A_1)^{-1}$. Now repeat this process,
using the column

\[ \begin{matrix} 2 \\ 2 \\ 0 \\ 0 \\ 1 \\ 1 \end{matrix} \]

to label the cards, placing the labels just to the left of the digit already on each card.
A moment's thought shows that this is equivalent to a single process in which one
labels the cards with pairs from $(A_2A_1)^*$. Thus $\pi[(A_2A_1)^*]^{-1} = \pi(A_1)^{-1}\pi(A_2)^{-1}$,
so that $\pi[(A_2A_1)^*] = \pi(A_2)\pi(A_1)$. The reader desiring further discussion for the
case of two columns is referred to Section 9.4 of the expository paper [27]. The
argument for $j \ge 3$ is identical: just use the observation that iterating the procedure
three times is equivalent to a single process in which one labels the cards with triples
from $(A_3A_2A_1)^*$. ■

With the above preparations in hand, Theorem 3.1 can be proved.

Proof of Theorem 3.1. To begin, note that

\begin{align*}
\kappa_1 = i_1, \ldots, \kappa_m = i_m &\iff \kappa(C_j \cdots C_1) = i_j \quad (1 \le j \le m) \\
&\iff d(\overline{C_j \cdots C_1}) = i_j \quad (1 \le j \le m) \\
&\iff d(\pi(\overline{C_j \cdots C_1})) = i_j \quad (1 \le j \le m).
\end{align*}

The first step used Lemma 3.2, the second step used Lemma 3.3, and the third step
used Lemma 3.4.

Let $A_m \cdots A_1 = (\overline{C_m \cdots C_1})^{-*}$, where $-*$ denotes the inverse of the star map. Then
$A_j \cdots A_1 = (\overline{C_j \cdots C_1})^{-*}$ for all $1 \le j \le m$, and Lemma 3.5 implies that

\[ d[\pi(A_j) \cdots \pi(A_1)] = d(\pi[(A_j \cdots A_1)^*]) = d[\pi(\overline{C_j \cdots C_1})] = i_j \]

for all $1 \le j \le m$. Now note that if $C_m \cdots C_1$ are chosen i.i.d. with entries uniform
in $0, 1, \ldots, b-1$, then the same is true of $A_m \cdots A_1$ since the bar and star maps are
both bijections. Note that each $\pi(A_i)$ has the distribution of a permutation after a
b-shuffle, so one may take $\tau_j$ to be the product $\pi(A_j) \cdots \pi(A_1)$, and the theorem is
proved. ■

Remark and example: The above construction may appear complicated, but we
mention that the star map (though useful in the proof) is not needed in order to
go from the columns of numbers being added to the $\tau$'s. Indeed, from the proof of
Theorem 3.1 one sees that the $\tau_j$'s can be defined by $\tau_j = \pi(\overline{C_j \cdots C_1})$. Thus in the
running example,

\[ C_3C_2C_1 = \begin{pmatrix} 0&1&2 \\ 0&1&2 \\ 1&1&2 \\ 1&1&1 \\ 2&1&2 \\ 1&2&1 \end{pmatrix} \mapsto \overline{C_3C_2C_1} = \begin{pmatrix} 0&1&2 \\ 1&0&1 \\ 2&2&0 \\ 1&0&1 \\ 0&2&0 \\ 2&1&1 \end{pmatrix} \mapsto \begin{array}{ccc} \tau_3 & \tau_2 & \tau_1 \\ 1&4&6 \\ 3&1&3 \\ 6&5&1 \\ 4&2&4 \\ 2&6&2 \\ 5&3&5 \end{array} \]

Observe that $\kappa_1 = 3$, $\kappa_2 = 3$, $\kappa_3 = 2$, and that $d(\tau_1) = 3$, $d(\tau_2) = 3$, $d(\tau_3) = 2$ as
claimed.
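
The passage from the columns being added to the permutations $\tau_j$ is short enough to code directly. The following sketch implements the bar map and the labeling $\pi$ as described in this section and reruns the example; the function names are illustrative, and rows are encoded as digit tuples written left to right.

```python
def perm_of_rows(rows):
    """pi: label rows from lexicographically smallest to largest, ties broken top first."""
    order = sorted(range(len(rows)), key=lambda i: (rows[i], i))
    pi = [0] * len(rows)
    for label, i in enumerate(order, start=1):
        pi[i] = label
    return pi

def descents(p):
    return sum(p[i] > p[i + 1] for i in range(len(p) - 1))

def bar(rows, b):
    """Bar map: the i-th row is the right-most j digits of the sum of the first i rows."""
    j = len(rows[0])
    total, out = 0, []
    for row in rows:
        value = 0
        for digit in row:                  # read the j-tuple as a base-b number
            value = value * b + digit
        total = (total + value) % b**j     # keep only the right-most j digits
        out.append(tuple((total // b**k) % b for k in range(j - 1, -1, -1)))
    return out

def column_carries(rows, b):
    """kappa_1, ..., kappa_m from ordinary column-by-column addition."""
    carry, carries = 0, []
    for col in range(len(rows[0]) - 1, -1, -1):
        carry = (carry + sum(row[col] for row in rows)) // b
        carries.append(carry)
    return carries

C = [(0, 1, 2), (0, 1, 2), (1, 1, 2), (1, 1, 1), (2, 1, 2), (1, 2, 1)]
b = 3
# tau_j = pi(bar(C_j ... C_1)), using only the right-most j columns
taus = [perm_of_rows(bar([row[-j:] for row in C], b)) for j in (1, 2, 3)]
print(column_carries(C, b))            # [3, 3, 2]
print([descents(t) for t in taus])     # [3, 3, 2]
```

The equality of the two printed lists is the pathwise statement behind Theorem 3.1: the bijections match carries and descents array by array, not just in distribution.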

As a corollary of Theorem 3.1 we deduce that the descent process after riffle
shuffles is Markov (usually, a function of a Markov chain is not Markov).

Corollary 3.6. Let a Markov chain on the symmetric group begin at the identity
and proceed by successive independent b-shuffles. Then $d(\pi)$, the number of descents,
forms a Markov chain.

Proof. This follows from Theorem 3.1 and the fact that the carries process is
Markov. ■



4 Applications to the Carries Process 

As in previous sections, let $\kappa_j$ be the amount carried from column j to column
j + 1 when n length-m numbers are added mod b. Suppose throughout this section
that the "digits" of these numbers are chosen uniformly and independently in
$\{0, 1, \ldots, b-1\}$.

Theorem 4.1. For $1 \le j \le m$, the expected value of $\kappa_j$ is $\mu_j = \frac{n-1}{2}\left(1 - \frac{1}{b^j}\right)$. The
variance of $\kappa_j$ is $\sigma_j^2 = \frac{n+1}{12}\left(1 - \frac{1}{b^{2j}}\right)$. Normalized by its mean and variance, for
large n, $\kappa_j$ has a limiting standard normal distribution.

Proof. From Lemma 3.3 of Section 3, $\kappa_j$ is distributed exactly like the number of
descents among the n rows of the right-most j digits of the random array. The
distribution of these descents is studied in [8], where they are shown to be a 2-dependent
process with the required mean and variance. The central limit theorem
for 2-dependent processes is classical. ■
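
The mean and variance formulas can be confirmed exactly for small cases by pushing the initial state $\kappa_0 = 0$ through the transition matrix. This is a sketch under the enumeration reading of (H1); `carries_matrix` and `carry_moments` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import product

def carries_matrix(n, b):
    """Transition matrix P(i, j) of (H1), by enumerating all b**n digit columns."""
    P = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for digits in product(range(b), repeat=n):
            P[i][(i + sum(digits)) // b] += Fraction(1, b**n)
    return P

def carry_moments(n, b, m):
    """Exact (mean, variance) of kappa_1, ..., kappa_m, from the rows P^j(0, .)."""
    P = carries_matrix(n, b)
    dist = [Fraction(1)] + [Fraction(0)] * (n - 1)        # kappa_0 = 0
    moments = []
    for _ in range(m):
        dist = [sum(dist[i] * P[i][k] for i in range(n)) for k in range(n)]
        mean = sum(k * p for k, p in enumerate(dist))
        var = sum(k * k * p for k, p in enumerate(dist)) - mean**2
        moments.append((mean, var))
    return moments

n, b, m = 4, 3, 5
for j, (mean, var) in enumerate(carry_moments(n, b, m), start=1):
    mu = Fraction(n - 1, 2) * (1 - Fraction(1, b**j))
    sigma2 = Fraction(n + 1, 12) * (1 - Fraction(1, b**(2 * j)))
    print(j, mean == mu, var == sigma2)
```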

Remarks: 

1. Note that $\mu_j$ and $\sigma_j^2$ increase to their limiting values $\frac{n-1}{2}$ and $\frac{n+1}{12}$ as j increases.

2. Let $S_m = \kappa_1 + \kappa_2 + \cdots + \kappa_m$ be the total number of carries. By linearity of
expectation and Theorem 4.1 this has mean

\[ \frac{n-1}{2}\left( m - \frac{1}{b-1}\left(1 - \frac{1}{b^m}\right) \right). \]

When n = 2, this was shown by Knuth [2g, p. 278]. He also finds the variance
of $S_m$ when n = 2. For fixed n and b, the central limit theorem for finite state
space Markov chains [6] shows that $S_m$, normalized by its mean and variance,
has a standard normal limiting distribution.

3. The fine properties of the number of carries within a column are studied in [3]
where it is shown to be a determinantal point process.

As shown above, the carries process $\kappa_j$, $0 \le j \le m$ (with $\kappa_0 = 0$) is a Markov
chain which has limiting stationary distribution $\pi(j) = A(n,j)/n!$. To study the
rate of convergence to the limit we first prove a new property of the amazing matrix
$P(i,j)$ of (H1). Recall that a matrix is totally positive of order two ($TP_2$) if all
the 2×2 minors are non-negative.

Lemma 4.2. For every n and b, the matrix $P(i,j)$ of (H1) is $TP_2$.
Proof. As noted on p. 140 of [23],

\[ P(i,j) = b^{-n}\,[x^{(j+1)b - i - 1}] \left( \frac{1 - x^b}{1 - x} \right)^{n+1}, \]

where $[x^l]f(x)$ denotes the coefficient of $x^l$ in a polynomial $f(x)$. Thus, up to the positive
factor $b^{-n}$, the transpose of P is a submatrix of the matrix with (i, j) coordinates
$[x^{i-j}]\,[(1 - x^b)/(1 - x)]^{n+1}$.
Since the product of $TP_2$ matrices is $TP_2$, it is enough to treat the case n = 0.
Now, the matrix is a lower triangular, n × n matrix with ones down the diagonal,
ones on the next lowest b − 1 diagonals and zeros elsewhere. For example, when
n = 6, b = 3 the relevant matrix is

\[ \begin{pmatrix} 1&0&0&0&0&0 \\ 1&1&0&0&0&0 \\ 1&1&1&0&0&0 \\ 0&1&1&1&0&0 \\ 0&0&1&1&1&0 \\ 0&0&0&1&1&1 \end{pmatrix}. \]

By inspection, 13 of the 16 possible 2×2 matrices can occur as minors. The missing
ones are

\[ \begin{pmatrix} 0&1 \\ 1&0 \end{pmatrix}, \quad \begin{pmatrix} 0&1 \\ 1&1 \end{pmatrix}, \quad \begin{pmatrix} 1&1 \\ 1&0 \end{pmatrix}, \]

these being the only ones with negative determinants. ■

Remark: When b = 2, the original $P(i,j) = 2^{-n}\binom{n+1}{2j-i+1}$ is totally positive
($TP_\infty$). Indeed, $P(i,j) = 2^{-n}[x^{2j-i+1}](1+x)^{n+1}$. Let $i' = i + 1$, $j' = j + 1$. This
becomes $2^{-n}[x^{2j'-i'}](1+x)^{n+1}$. Each minor of this is a subminor of $2^{-n}[x^{j'-i'}](1+x)^{n+1}$.
This is totally positive by the classification of Pólya frequency sequences
due to Schoenberg and Edrei ([25], Chap. 8).

Consider the basic transition matrix $P(i,j)$ for general b, n. This has stationary
distribution $\pi(j)$, $0 \le j \le n-1$, given in (H4). The carries Markov chain starts at 0
and the right-most carries tend to be smaller. This is seen in Theorem 4.1. It
is natural to ask how far over one must go so that the carries process is stationary.
If $P^r(0,j)$ is the chance of carry j after r steps, we measure the approach to
stationarity by separation

\[ \mathrm{sep}(r) = \max_j \left[ 1 - \frac{P^r(0,j)}{\pi(j)} \right]. \]

Thus $0 \le \mathrm{sep}(r) \le 1$ and $\mathrm{sep}(r)$ is small provided $P^r(0,j)$ is close to $\pi(j)$ for all
j. See [2] or [14] for further properties of separation. The following theorem shows
that convergence requires $r = 2\log_b n$.

Theorem 4.3. For any $b \ge 2$, $n \ge 2$, the transition matrix $P(i,j)$ of (H1) satisfies

1. For all $r \ge 0$, the separation distance $\mathrm{sep}(r)$ of the carries chain after r steps
(started at 0) is attained at the state j = n − 1.

2. For $r = 2\log_b(n) + \log_b(c)$,

\[ \mathrm{sep}(r) \to 1 - e^{-\frac{1}{2c}} \]

if c > 0 is fixed and $n \to \infty$.
Proof. By Lemma 4.2 the matrix $P(i,j)$ is $TP_2$. Thus the matrix $P^*(i,j) := [P(j,i)\pi(j)]/\pi(i)$
is also $TP_2$, since every 2×2 minor of $P^*$ is a positive multiple
of a 2×2 minor of P. Now consider the function $f_r(i) = P^r(0,i)/\pi(i)$. We claim
that $P^* f_r = f_{r+1}$. Indeed,

\[ [P^* f_r](i) = \sum_j P^*(i,j) f_r(j) = \sum_j \frac{P(j,i)\pi(j)}{\pi(i)} \cdot \frac{P^r(0,j)}{\pi(j)} = \frac{P^{r+1}(0,i)}{\pi(i)}. \]

Now the "variation-diminishing property" (p. 22 of [25]) gives that if f is monotone
and $P^*$ is $TP_2$, then $P^* f$ is monotone. Since $f_0$ is monotone (the walk is
started at 0), it follows that $f_r$ is monotone, i.e., that the separation distance $\mathrm{sep}(r)$
is attained at the state n − 1.

For the second assertion, note that by the relation between riffle shuffling and
the carries chain in Theorem 3.1, $P^r(0, n-1)$ is equal to the chance of being at the
unique permutation with n − 1 descents after r iterations of a b-shuffle; by [B[ this
is $b^{-rn}\binom{b^r}{n}$. Thus

\[ \mathrm{sep}(r) = 1 - \frac{P^r(0,n-1)}{\pi(n-1)} = 1 - \exp\left( \sum_{i=1}^{n-1} \log\left(1 - \frac{i}{b^r}\right) \right). \]

Letting $b^r = cn^2$ with c > 0 fixed, this tends to $1 - e^{-\frac{1}{2c}}$ as $n \to \infty$. ■
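
Both parts of Theorem 4.3 can be illustrated for a small deck size by computing the separation exactly. The sketch below again assumes the enumeration reading of (H1) and takes the Eulerian numbers from (H4'''); the function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial

def carries_matrix(n, b):
    """Transition matrix P(i, j) of (H1), by enumerating all b**n digit columns."""
    P = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for digits in product(range(b), repeat=n):
            P[i][(i + sum(digits)) // b] += Fraction(1, b**n)
    return P

def eulerian(n, j):
    """A(n, j) from the alternating sum (H4''')."""
    return sum((-1)**l * comb(n + 1, l) * (j + 1 - l)**n for l in range(j + 2))

def separation_profile(n, b, steps):
    """For r = 1..steps: (sep(r), gap at state n-1, 1 - prod_{i<n} (1 - i/b^r))."""
    P = carries_matrix(n, b)
    pi = [Fraction(eulerian(n, j), factorial(n)) for j in range(n)]
    dist = [Fraction(1)] + [Fraction(0)] * (n - 1)        # chain started at 0
    profile = []
    for r in range(1, steps + 1):
        dist = [sum(dist[i] * P[i][k] for i in range(n)) for k in range(n)]
        gaps = [1 - dist[j] / pi[j] for j in range(n)]
        prod = Fraction(1)
        for i in range(1, n):
            prod *= 1 - Fraction(i, b**r)
        profile.append((max(gaps), gaps[n - 1], 1 - prod))
    return profile

for r, (sep, at_top, closed_form) in enumerate(separation_profile(4, 2, 6), start=1):
    print(r, float(sep), sep == at_top == closed_form)
```

For n = 4 and b = 2 the separation equals 1 after one step and 29/32 after two, then decays as $b^r$ grows past $n^2$.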



Remark: It is known that it takes $r = 2\log_b n$ b-shuffles to make separation
distance small on the symmetric group. Via Theorem 3.1 this shows $2\log_b n$ steps
suffice for the carries process. Of course, fewer steps might suffice but Theorem
4.3 shows the result is sharp for large n. In mild contrast, it is known [l|, [E| that
$(3/2)\log_2 n$ "ordinary" (b = 2) riffle shuffles are necessary and suffice for total variation
convergence. We can show that for b = 2, $\log_2 n$ carry steps suffice for binary
addition. Our argument uses the monotonicity proved above, the first eigenvector
from [23], and Proposition 2.1 of [16]; for a second argument, using symmetric
functions, see [ll|. We do not know that this upper bound is sharp; the best total
variation lower bound we have is $(1/2)\log_2 n$.
5 Three Related Topics



The "amazing matrix" turns up in different contexts (sections of generating functions)
in the work of Brenti-Welker. There is an analog of carries for multiplication
which has interesting structure. Finally, there are quite different amazing
matrices having many of the same properties as Holte's. These three topics are
briefly developed in this section.

5.1 Sections of generating functions 

Some natural sequences $a_k$, $0 \le k < \infty$, have generating functions

\[ \sum_{k=0}^{\infty} a_k x^k = \frac{h(x)}{(1-x)^{n+1}} \tag{5} \]

with $h(x) = h_0 + h_1 x + \cdots + h_{n+1}x^{n+1}$ a polynomial of degree at most n + 1. For
example, the generating function of $a_k = k^n$ has this form with h(x) the Eulerian
polynomials of (H4''). Rational generating functions characterize sequences $\{a_k\}$
which satisfy a constant coefficient recurrence (2||. They arise naturally as the
Hilbert series of graded algebras ([3], Chapter 10.4).

Suppose we are interested in every r-th term $\{a_{rk}\}$, $0 \le k < \infty$. It is not hard
to see that

\[ \sum_{k=0}^{\infty} a_{rk} x^k = \frac{h^{\langle r \rangle}(x)}{(1-x)^{n+1}} \]

for another polynomial $h^{\langle r \rangle}(x)$ of degree at most n + 1. Brenti and Welker [§]
show that the i-th coefficient of $h^{\langle r \rangle}(x)$ satisfies

\[ h_i^{\langle r \rangle} = \sum_{j=0}^{n+1} C(i,j)\, h_j \]

with C an (n + 2) × (n + 2) matrix with (i, j) entry ($0 \le i, j \le n+1$) equal to the
number of solutions to $a_1 + \cdots + a_{n+1} = ib - j$ where $0 \le a_l \le b-1$ are integers
(here b plays the role of r).

The carries matrix is closely related to their matrix. Indeed, remove from C the
i = 0, n + 1 rows and the j = 0, n + 1 columns. Let $i' = i - 1$, $j' = j - 1$. This
gives an n × n matrix with $(i', j')$ entry ($0 \le i', j' \le n-1$) equal to the number
of solutions to $a_1 + \cdots + a_{n+1} = (i'+1)b - (j'+1)$ where $0 \le a_l \le b-1$ are
integers. Multiplying by $b^{-n}$ and taking transposes gives the carries matrix for
mod b addition of n numbers (see the top of p. 140 of [23]). Brenti and Welker
develop some properties of the transformation C. We hope some of the facts from
the present development (in particular the central limit theorems satisfied by the
coefficients) will illuminate their algebraic applications.

5.2 Carries for multiplication 

Consider the process of base b multiplication of a random number (digits chosen
from the uniform distribution on $\{0, 1, \ldots, b-1\}$) by a fixed number k > 0. We
do not require that k is single-digit. Then there is a natural way to define a carries
process, which is best defined by example. Let k = 26 and consider multiplying
1423 by 26 base 10. The zeroth carry is defined as $\kappa_0 = 0$. To compute the first
carry, note that 26 × 3 = 78, so $\kappa_1 = 7$. Then $\kappa_1 + 26 \times 2 = 59$, so $\kappa_2 = 5$. Next
$\kappa_2 + 26 \times 4 = 109$, so $\kappa_3 = 10$. Finally, $\kappa_3 + 26 \times 1 = 36$, so $\kappa_4 = 3$.

It is not difficult to see that the above process is a Markov chain on the state
space $\{0, 1, \ldots, k-1\}$. For example, if b = 10 and k = 7, the transition matrix is

\[ K(i,j) = \frac{1}{10} \begin{pmatrix} 2&1&2&1&2&1&1 \\ 2&1&2&1&1&2&1 \\ 2&1&1&2&1&2&1 \\ 1&2&1&2&1&2&1 \\ 1&2&1&2&1&1&2 \\ 1&2&1&1&2&1&2 \\ 1&1&2&1&2&1&2 \end{pmatrix}. \]



The matrix $K(i,j)$ above does not have all eigenvalues real, but the following
properties do hold in general:

• $K(i,j)$ is doubly stochastic, meaning that every row and column sums to 1.

• $K(i,j)$ is a generalized circulant matrix, meaning that each column is obtained
from the previous column by shifting it downward by b mod k.

• Fix k and let $K_a$, $K_b$ be the base a, b transition matrices for multiplication by
k. Then $K_{ab} = K_a K_b$.

The first two properties are at the level of undergraduate exercises, and Chapter
5 of [13] is a useful reference for generalized circulants. The third property holds
for the same reason that it does for Holte's matrix (see the explanation on page 143
of [23]).
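
The multiplication carries and their transition matrix follow directly from the worked example: from carry i and a uniform digit d, the next carry is the integer part of (i + kd)/b. The sketch below encodes this rule and checks the first and third properties; the function names are illustrative.

```python
from fractions import Fraction

def multiplication_carries(digits, k, b):
    """Carries kappa_1, kappa_2, ... when the number with the given digits
    (most significant first) is multiplied by k in base b."""
    carry, carries = 0, []
    for d in reversed(digits):               # right-most digit first
        carry = (carry + k * d) // b
        carries.append(carry)
    return carries

def multiplication_matrix(k, b):
    """K(i, j): chance of moving from carry i to carry j with a uniform digit."""
    K = [[Fraction(0)] * k for _ in range(k)]
    for i in range(k):
        for d in range(b):
            K[i][(i + k * d) // b] += Fraction(1, b)
    return K

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

print(multiplication_carries([1, 4, 2, 3], 26, 10))    # [7, 5, 10, 3]

K = multiplication_matrix(7, 10)
doubly_stochastic = (all(sum(row) == 1 for row in K)
                     and all(sum(col) == 1 for col in zip(*K)))
factorizes = matmul(multiplication_matrix(7, 2), multiplication_matrix(7, 5)) == K
print(doubly_stochastic, factorizes)
```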

Since K is doubly stochastic, the carries chain for multiplication has the uniform
distribution on $\{0, 1, \ldots, k-1\}$ as its stationary distribution. Concerning
convergence rates, one has the following simple upper bound for total variation
distance.

Proposition 5.1. Let $K_0^r$ denote the distribution of the carries chain for multiplication
by k base b after r steps, started at the state 0. Let $\pi$ denote the uniform
distribution on $\{0, 1, \ldots, k-1\}$. Then

\[ \frac{1}{2} \sum_{j=0}^{k-1} \left| K_0^r(j) - \pi(j) \right| \le \frac{k}{2b^r}. \]

Proof. Observe that

\[ K_0^r(j) = \frac{1}{b^r} \left| \{ x : jb^r \le kx < (j+1)b^r,\ 0 \le x < b^r \} \right|. \]

The number of integers x satisfying $\frac{jb^r}{k} \le x < \frac{(j+1)b^r}{k}$ is between $\frac{b^r}{k} - 1$ and $\frac{b^r}{k} + 1$.
Hence $\left| K_0^r(j) - \frac{1}{k} \right| \le \frac{1}{b^r}$, and the result follows by summing over j. ■

Convergence rate lower bounds depend on the number theoretic relation of k
and b in a complicated way. For instance if k = b, the process is exactly random
after 1 step.
5.3 Another amazing matrix



From one point of view, Holte's amazing matrix exists because there is a "big"
Markov chain on the symmetric group $S_n$ with eigenvalues $1, 1/b, 1/b^2, \ldots$ and a
function $T : S_n \to \{0, 1, \ldots, n-1\}$ with image this very same Markov chain. Of
course, the interpretation as "carries" remains amazing. There are many functions
of the basic riffle shuffling Markov chain which remain Markov chains. Here is a
simple one. Consider repeated shuffling of a deck of n cards using the Gilbert-Shannon-Reeds
b-shuffles. The position of the card labeled "one" gives a Markov chain
on $\{1, 2, \ldots, n\}$. In [4] the transition matrix of this chain is shown to be

\[ Q_b(i,j) = \frac{1}{b^n} \sum_{k=1}^{b} \sum_{r=\ell}^{u} \binom{j-1}{r} \binom{n-j}{i-r-1} k^r (b-k)^{j-1-r} (k-1)^{i-1-r} (b-k+1)^{(n-j)-(i-r-1)} \tag{6} \]

where the inner sum is from $\ell = \max(0, (i+j) - (n+1))$ to $u = \min(i-1, j-1)$.
For example, when n = 2, 3 the matrices are

\[ \frac{1}{2b} \begin{pmatrix} b+1 & b-1 \\ b-1 & b+1 \end{pmatrix}, \qquad \frac{1}{6b^2} \begin{pmatrix} (b+1)(2b+1) & 2(b^2-1) & (b-1)(2b-1) \\ 2(b^2-1) & 2(b^2+2) & 2(b^2-1) \\ (b-1)(2b-1) & 2(b^2-1) & (b+1)(2b+1) \end{pmatrix}. \]

The matrix $Q_b$ is shown to satisfy

• $Q_b$ has eigenvalues $1, 1/b, 1/b^2, \ldots, 1/b^{n-1}$.

• The eigenvectors of $Q_b$ do not depend on b; in particular, the stationary
distribution is uniform: $\pi(i) = 1/n$, $1 \le i \le n$.

• $Q_a Q_b = Q_{ab}$.

We guess that $Q_b$ has other nice properties and appearances.
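
Formula (6) is easy to mistype, so a direct evaluation against the displayed n = 2, 3 matrices is a useful check. The sketch below assumes the reading of (6) given above, with the convention that a zero base raised to the zero power equals 1; `card_one_matrix` is a hypothetical name.

```python
from fractions import Fraction
from math import comb

def card_one_matrix(n, b):
    """Matrix Q_b(i, j) of (6), for positions 1 <= i, j <= n."""
    Q = [[Fraction(0)] * n for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            lo, hi = max(0, i + j - (n + 1)), min(i - 1, j - 1)
            total = 0
            for k in range(1, b + 1):
                for r in range(lo, hi + 1):
                    total += (comb(j - 1, r) * comb(n - j, i - r - 1)
                              * k**r * (b - k)**(j - 1 - r)
                              * (k - 1)**(i - 1 - r)
                              * (b - k + 1)**((n - j) - (i - r - 1)))
            Q[i - 1][j - 1] = Fraction(total, b**n)
    return Q

def closed_form_n2(b):
    return [[Fraction(x, 2 * b) for x in row] for row in [[b + 1, b - 1], [b - 1, b + 1]]]

def closed_form_n3(b):
    s, t, u, v = (b + 1) * (2*b + 1), 2 * (b*b - 1), (b - 1) * (2*b - 1), 2 * (b*b + 2)
    return [[Fraction(x, 6 * b * b) for x in row] for row in [[s, t, u], [t, v, t], [u, t, s]]]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

print(all(card_one_matrix(2, b) == closed_form_n2(b) for b in range(1, 7)))
print(all(card_one_matrix(3, b) == closed_form_n3(b) for b in range(1, 7)))
print(matmul(card_one_matrix(3, 2), card_one_matrix(3, 3)) == card_one_matrix(3, 6))
```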